Fuzzy Rule Based Systems for Gender Classification from Blog Data
Abstract
Gender classification is a popular machine learning task with applications in areas such as business intelligence, access control and cyber security. In the context of information granulation, gender-related information can be divided into three types: biological information, vision-based information and social-network-based information. In traditional machine learning, gender identification has typically been treated as a discriminative classification task, i.e. the aim is to learn a classifier that discriminates between male and female. In this paper, we argue that discriminative classification is not always an appropriate way to identify gender, especially because both male and female populations are highly diverse, so individuals of different genders can closely resemble each other in their characteristics. To address this issue, we propose a fuzzy approach to generative classification of gender. In particular, we focus on gender classification based on social network information. We conduct an experimental study on a blog data set and compare the fuzzy approach with C4.5, Naive Bayes and Support Vector Machine in terms of classification performance. The results show that the fuzzy approach outperforms the other approaches and is also capable of capturing the diversity of both male and female people and of dealing with the fuzziness of gender identification.
Keywords: Data Mining; Machine Learning; Fuzzy Rule Based Systems; Text Classification; Gender Classification
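The contrast the abstract draws between a hard, discriminative decision and a fuzzy one can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. The feature name `feature_score`, the triangular membership functions, their parameters and the two rules are all hypothetical, chosen only to show that a fuzzy rule base can assign a sample a degree of membership in both classes at once.

```python
from dataclasses import dataclass


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at or outside [a, c], rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


@dataclass
class FuzzyRule:
    feature: str   # hypothetical feature name
    params: tuple  # (a, b, c) of the triangular membership function
    label: str     # class the rule votes for


def class_memberships(sample: dict, rules: list) -> dict:
    """Return, per class, the strongest firing strength among its rules."""
    degrees = {}
    for rule in rules:
        strength = triangular(sample[rule.feature], *rule.params)
        degrees[rule.label] = max(degrees.get(rule.label, 0.0), strength)
    return degrees


# Hypothetical rule base: two overlapping fuzzy sets over one feature in [0, 1].
RULES = [
    FuzzyRule("feature_score", (0.0, 0.25, 0.75), "male"),
    FuzzyRule("feature_score", (0.25, 0.75, 1.0), "female"),
]

# A sample in the overlap region belongs partly to both classes.
ambiguous = class_memberships({"feature_score": 0.5}, RULES)

# A sample at the peak of one fuzzy set belongs fully to that class only.
clear_cut = class_memberships({"feature_score": 0.25}, RULES)
```

In this sketch, a sample in the overlap region keeps a nonzero degree for each class instead of being forced to one side of a decision boundary, which is one way to read the abstract's point about individuals of different genders having similar characteristics.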
Similar resources
On Mining Fuzzy Classification Rules for Imbalanced Data
Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, such that it performs poorly with respect to the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...
A QUADRATIC MARGIN-BASED MODEL FOR WEIGHTING FUZZY CLASSIFICATION RULES INSPIRED BY SUPPORT VECTOR MACHINES
Recently, tuning the weights of the rules in Fuzzy Rule-Based Classification Systems has been researched in order to improve classification accuracy. In this paper, a margin-based optimization model, inspired by Support Vector Machine classifiers, is proposed to compute these fuzzy rule weights. This approach not only considers both accuracy and generalization criteria in a single objective fu...
A Margin-based Model with a Fast Local Search for Rule Weighting and Reduction in Fuzzy Rule-based Classification Systems
Fuzzy Rule-Based Classification Systems (FRBCS) are highly investigated by researchers due to their noise-stability and interpretability. Unfortunately, generating a rule-base that is both sufficiently accurate and interpretable is a hard process. Rule weighting is one of the approaches to improve the accuracy of a pre-generated rule-base without modifying the original rules. Most of the pro...
USING DISTRIBUTION OF DATA TO ENHANCE PERFORMANCE OF FUZZY CLASSIFICATION SYSTEMS
This paper considers the automatic design of fuzzy rule-based classification systems based on labeled data. The classification performance and interpretability are of major importance in these systems. In this paper, we utilize the distribution of training patterns in decision subspace of each fuzzy rule to improve its initially assigned certainty grade (i.e. rule weight). Our approach uses a punish...
Publication date: 2018